Introduction

This dataset contains a list of video games with sales greater than 100,000 copies. It was generated by a scrape of vgchartz.com.

Fields include:

  1. Rank - Ranking of overall sales
  2. Name - The game's name
  3. Platform - Platform of the game's release (e.g. PC, PS4)
  4. Year - Year of the game’s release
  5. Genre - Genre of the game
  6. Publisher - Publisher of the game
  7. NA_Sales - Sales in North America (in millions)
  8. EU_Sales - Sales in Europe (in millions)
  9. JP_Sales - Sales in Japan (in millions)
  10. Other_Sales - Sales in the rest of the world (in millions)
  11. Global_Sales - Total worldwide sales (in millions)

The script used to scrape the data is available at https://github.com/GregorUT/vgchartzScrape. It is written in Python and based on BeautifulSoup. There are 16,598 records. Two records were dropped due to incomplete information.

data = read.csv("vgsales.csv", header = TRUE)
str(data)
## 'data.frame':    16598 obs. of  11 variables:
##  $ Rank        : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Name        : Factor w/ 11493 levels "'98 Koshien",..: 10991 9343 5532 10993 7370 9707 6648 10989 6651 2594 ...
##  $ Platform    : Factor w/ 31 levels "2600","3DO","3DS",..: 26 12 26 26 6 6 5 26 26 12 ...
##  $ Year        : Factor w/ 40 levels "1980","1981",..: 27 6 29 30 17 10 27 27 30 5 ...
##  $ Genre       : Factor w/ 12 levels "Action","Adventure",..: 11 5 7 11 8 6 5 4 5 9 ...
##  $ Publisher   : Factor w/ 579 levels "10TACLE Studios",..: 369 369 369 369 369 369 369 369 369 369 ...
##  $ NA_Sales    : num  41.5 29.1 15.8 15.8 11.3 ...
##  $ EU_Sales    : num  29.02 3.58 12.88 11.01 8.89 ...
##  $ JP_Sales    : num  3.77 6.81 3.79 3.28 10.22 ...
##  $ Other_Sales : num  8.46 0.77 3.31 2.96 1 0.58 2.9 2.85 2.26 0.47 ...
##  $ Global_Sales: num  82.7 40.2 35.8 33 31.4 ...
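
The structure above shows Year read in as a factor rather than a number, and the field list describes Global_Sales as the worldwide total of the regional columns. The sketch below shows two quick sanity checks that follow from this. The helper names `year_to_int` and `sales_mismatch`, the "N/A" placeholder, and the 0.02 rounding tolerance are assumptions made for illustration, and the toy values are made up rather than taken from the dataset.

```r
# Hypothetical helper: convert a factor or character Year column to integer.
# Entries that are not valid numbers (assumed to be placeholders such as
# "N/A") become NA instead of raising a warning.
year_to_int <- function(x) {
  suppressWarnings(as.integer(as.character(x)))
}

# Hypothetical helper: return the row indices where the regional sales do
# not add up to Global_Sales within a rounding tolerance (assumed: 0.02).
sales_mismatch <- function(df, tol = 0.02) {
  regional <- df$NA_Sales + df$EU_Sales + df$JP_Sales + df$Other_Sales
  which(abs(regional - df$Global_Sales) > tol)
}

# Toy stand-ins with the same column names as the dataset (made-up values)
toy_year <- factor(c("2006", "1985", "N/A", "2008"))
toy_sales <- data.frame(
  NA_Sales     = c(1.00, 1.00),
  EU_Sales     = c(0.50, 0.50),
  JP_Sales     = c(0.25, 0.25),
  Other_Sales  = c(0.25, 0.25),
  Global_Sales = c(2.00, 3.00)   # second row is deliberately inconsistent
)

years <- year_to_int(toy_year)
bad_rows <- sales_mismatch(toy_sales)

# Applied to the data frame loaded above, this might look like:
# data$Year <- year_to_int(data$Year)
# sales_mismatch(data)
```

Going through `as.character()` first matters for factors: calling `as.integer()` directly on a factor returns the internal level codes, not the years.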

Data Visualizations

  1. Sales by Platform:
  2. Genre Popularity over the Years:
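
As a rough sketch of how the data behind these two plots could be prepared, the example below aggregates Global_Sales with base R. It assumes that "popularity" means total global sales (a count of titles would be an equally valid reading), and the function names and the toy data frame are hypothetical stand-ins, not part of the original analysis.

```r
# Toy stand-in with the same column names as the dataset (made-up values)
toy <- data.frame(
  Platform     = c("Wii", "NES", "Wii", "PS4"),
  Year         = c(2006, 1985, 2008, 2015),
  Genre        = c("Sports", "Platform", "Racing", "Action"),
  Global_Sales = c(3.0, 2.0, 1.5, 1.0)
)

# 1. Total global sales per platform, largest first
sales_by_platform <- function(df) {
  totals <- aggregate(Global_Sales ~ Platform, data = df, FUN = sum)
  totals[order(-totals$Global_Sales), ]
}

# 2. Total global sales per genre and year (one row per Year/Genre pair)
genre_by_year <- function(df) {
  aggregate(Global_Sales ~ Year + Genre, data = df, FUN = sum)
}

# Bar chart of the platform totals (defined here, not called)
plot_sales_by_platform <- function(totals) {
  barplot(totals$Global_Sales,
          names.arg = as.character(totals$Platform),
          las = 2,
          ylab = "Global sales (millions)",
          main = "Sales by Platform")
}

platform_totals <- sales_by_platform(toy)
genre_totals <- genre_by_year(toy)

# With the real data frame, this might look like:
# plot_sales_by_platform(sales_by_platform(data))
```

The formula interface of `aggregate()` drops rows with missing values in the grouping columns by default, so games without a usable Year would be left out of the per-year totals.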